
Runbook: Slack Notification – Empty Scraping Result

Service: Review Management x Tripla Review – Webscraper
Owner Team Slack Handle: @bnl-dev-bali
Team's Slack Channel: #bnl-teams-b
Alert Channel: #bnl-products-alerts

Table of Contents

  • [[#Important Links]]
  • [[#1. Triage]]
  • [[#2. Decision Point]]
  • [[#3. False Alarm]]
  • [[#4. True Incident]]
    • [[#4.1. Recover the System]]
    • [[#4.2. Clean up]]

Important Links

Alert: Slack Notification – Empty Scraping Result
Webscraper Dashboard: Webscraper Dashboard URL
Reviewku Portal: Reviewku Portal URL
Reviewku Worker Logs: Worker Service Logs URL
Review API Endpoint: https://review-api.bookandlink.com/h/ws/notif

1. Triage

Goal: Determine whether scraping failed due to captcha blocking, invalid start URL, or webhook/queue failure.


Step A - Validate Scraping Job

  1. Log in to the Webscraper dashboard.
  2. Go to the Jobs menu.
  3. Type the Property ID into the search field.
  4. Click Inspect.
  5. Open the Details tab.

Check if any of these values are greater than 0:

  • Failed pages
  • Empty pages
  • No value pages

If any value > 0 → scraping issue confirmed.
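The check above can be written as a single condition (a sketch: the three counts are read manually from the Details tab, and the values below are hypothetical, for illustration only):

```shell
# Sketch: confirm a scraping issue from the job stats in the Details tab.
# FAILED_PAGES, EMPTY_PAGES, NO_VALUE_PAGES are read off the dashboard by
# hand; the values below are hypothetical examples.
FAILED_PAGES=2
EMPTY_PAGES=0
NO_VALUE_PAGES=0

if [ "$FAILED_PAGES" -gt 0 ] || [ "$EMPTY_PAGES" -gt 0 ] || [ "$NO_VALUE_PAGES" -gt 0 ]; then
  echo "scraping issue confirmed"
else
  echo "job clean"
fi
```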


Step B - Identify Specific Page Issue

If Failed Pages > 0:

  • Click the Failed Pages tab.
  • Inspect the culprit page.
  • Most common cause: captcha blocking.

If Empty Pages > 0 or No Value Pages > 0:

  • Click the respective tab.
  • Click Preview.
  • Check the HTTP response (for example, 404).

Step C - Validate Queue Processing

Go to Reviewku Worker Service Logs.

Search by Property ID.

If you find log lines like:

Message received: map[channel:traveloka event:download-scraped-data ...]

SUCCESS: Successfully inserted X reviews with property ID Y into the main table.

then the scraping result was processed successfully.

If no such logs are found → webhook or queue issue.
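If you have an exported copy of the worker logs, the same check can be scripted (a sketch: `worker.log` and property ID 64 are placeholders, and the sample lines mirror the log format shown above):

```shell
# Sketch: search an exported worker log for one property's success line.
# worker.log and PROP_ID are placeholders; the sample lines written here
# mirror the log format shown in this runbook.
PROP_ID=64
cat > worker.log <<'EOF'
Message received: map[channel:traveloka event:download-scraped-data ...]
SUCCESS: Successfully inserted 120 reviews with property ID 64 into the main table.
EOF

if grep -q "SUCCESS: Successfully inserted .* property ID ${PROP_ID} " worker.log; then
  echo "scraping result processed"
else
  echo "webhook or queue issue"
fi
```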


2. Decision Point

  • IF Failed Pages > 0 and captcha detected...

    • ➡️ Go to: [[#4. True Incident]]
  • IF Empty Pages caused by 404 or invalid channel URL...

    • ➡️ Go to: [[#4. True Incident]]
  • IF Webhook not delivered or no worker logs found...

    • ➡️ Go to: [[#4. True Incident]]
  • IF scraping valid and worker inserted successfully...

    • ➡️ Go to: [[#3. False Alarm]]

3. False Alarm

If:

  • All scraping error counts (Failed, Empty, No value pages) are 0
  • Worker logs show successful insertion
  • No new scraping errors

Then the Slack alert may be outdated or already resolved.

Actions:

  1. Refresh Webscraper job.
  2. Re-check review count.
  3. Monitor for 15 minutes.

Post in Slack:

Empty Scraping Result alert reviewed.
Scraping and queue processing verified.
No active issue detected.

4. True Incident

The scraping job failed, or its result was not processed.

Primary objective: Restore successful scraping and review insertion.


4.1. Recover the System

Case 1 - Captcha Blocking

Diagnostic Steps

  • Failed Pages > 0
  • Page preview shows captcha challenge

Remediation Plan

  1. Click Inspect.
  2. Go to Continue tab.
  3. Change proxy.
  4. Click Continue Scraping.

Recommended proxies:

  • US
  • Indonesia
  • Australia
  • Japan

Switch between these if needed.

If none work:

  • Try the nearest alternative proxy once.
  • If it still fails → report to the Webscraper team.

Verification

  • Failed Pages = 0
  • Data preview contains valid reviews.

Case 2 - Empty Pages / Invalid Channel URL

Diagnostic Steps

  • Empty Pages > 0
  • Preview shows 404 or invalid page

Remediation Plan

  1. Go to official channel website (example: Expedia).
  2. Search property manually.
  3. Copy correct official property URL.

Example:

https://www.expedia.com.ph/Manila-Hotels-Red-Planet-Manila-Amorsolo

  4. Log in to the Portal.
  5. Navigate:
    Reviewku → Properties → Edit
  6. Under Channels, click the settings (gear) icon.
  7. Update the channel URL.
  8. Click Save.
  9. Click Fetch New under the respective channel tab.
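Before clicking Fetch New, the corrected URL can be sanity-checked from a terminal (a sketch using the Expedia example URL above; note that some channel sites return 403 to non-browser clients, so treat anything other than 200/404 as inconclusive):

```shell
# Sketch: check the HTTP status of the corrected channel URL before saving.
# The URL is the Expedia example from this runbook; some sites block plain
# curl (e.g. 403), so only a 404 is a firm sign the URL is still wrong.
URL='https://www.expedia.com.ph/Manila-Hotels-Red-Planet-Manila-Amorsolo'
code=$(curl -s -o /dev/null -w '%{http_code}' "$URL")
if [ "$code" = "200" ]; then
  echo "URL looks valid"
elif [ "$code" = "404" ]; then
  echo "URL still broken (HTTP 404)"
else
  echo "inconclusive (HTTP $code) - verify in a browser"
fi
```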

Verification

  • New scraping job created.
  • Data preview valid.
  • Failed pages = 0.

Case 3 - Webhook Not Delivered

Diagnostic Steps

  • No SUCCESS log in Worker service.
  • Webhook logs show non-200 status.

Remediation Plan

Re-trigger webhook manually:

curl -X POST 'https://review-api.bookandlink.com/h/ws/notif' \
-A 'webscraper.io/v1' \
-H 'Content-Type: application/x-www-form-urlencoded' \
-H 'Signature: faebe19220e9cb22d40e9280e4083ae4f1749dc9a84267fbadd44096cf88cbbc' \
-d 'scrapingjob_id={scrapingJobID}&status=finished&sitemap_id={sitemapID}&sitemap_name={sitemapName}&custom_id={channelName}-{propID}-production'

The placeholder values can be found at:

Jobs → Scraping Job Details → Detail tab

  • scrapingJobID
  • sitemapID
  • sitemapName
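To reduce copy/paste mistakes, the `-d` payload can be assembled from those values first (a sketch: the numeric values match the sample expected response in this runbook, while `sitemapName` here is a hypothetical placeholder to be replaced with the real name from the Detail tab):

```shell
# Sketch: build the webhook payload from the scraping job details.
# scrapingJobID, sitemapID, channelName, and propID match the sample
# expected response in this runbook; sitemapName is hypothetical.
scrapingJobID=39140478
sitemapID=953592
sitemapName=expedia-reviews
channelName=expedia
propID=64

payload="scrapingjob_id=${scrapingJobID}&status=finished&sitemap_id=${sitemapID}&sitemap_name=${sitemapName}&custom_id=${channelName}-${propID}-production"
echo "$payload"
# Pass it to the curl command above as: -d "$payload"
```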

Expected response:

{"action taken":"Processing Data","custom id":"expedia-64-production","job id":"39140478","site map id":"953592","status":"finished"}
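The acknowledgement can also be checked mechanically (a sketch: a plain grep on the raw JSON avoids extra dependencies such as jq; RESP below is the sample response from this runbook):

```shell
# Sketch: verify the webhook response acknowledges processing.
# RESP is the sample expected response from this runbook; in practice,
# capture it from the curl call: RESP=$(curl ... ).
RESP='{"action taken":"Processing Data","custom id":"expedia-64-production","job id":"39140478","site map id":"953592","status":"finished"}'

if printf '%s' "$RESP" | grep -q '"action taken":"Processing Data"'; then
  echo "webhook accepted"
else
  echo "unexpected response: $RESP"
fi
```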

Verification

  • Worker logs show successful insertion.
  • Review count updated.

4.2. Clean up

  1. Confirm review count updated in Reviewku dashboard.
  2. Confirm no duplicate insertions.
  3. Monitor worker logs for 30 minutes.
  4. Post a Slack resolution update:
Empty Scraping Result resolved.
Scraping job reprocessed successfully.
Review synchronization restored.
  5. If captcha blocking recurs frequently, escalate to the Webscraper team for long-term mitigation.